A MATHEMATICAL TREATMENT OF SOME
BIOLOGICAL PROBLEMS. 

SHINKISHI HATAI, 

Associate in Neurology, The Wistar Institute of Anatomy. 

Dr. King's work ('07, '09), "On Studies on Sex-determination
in Amphibians," suggests an interesting problem which may be
put in the following form:

A jar contains a large number of male and female tadpoles, the 
proportion of each being unknown : if on picking out m + n tad- 
poles, m are found to be males and n females, to find the prob- 
ability that the ratio of the number of either sex to the entire lot 
lies between given limits. 

It will be seen that problems of this nature occur frequently 
in biological investigations and that it is of importance to have 
some method for determining the accuracy of the observed pro- 
portions. This can be done by means of the formula given below. 
As the development of the formula is somewhat complicated I 
shall present the entire process of the mathematical treatment of 
the solutions based on the theorem of Bayes, together with one 
application. Although such an elementary exposition of the subject
will be superfluous for one who is familiar with the theory of
probabilities, it may nevertheless be helpful for others.

If $p$ denote the probability that an event will happen, then
$(1-p)$ is the probability that the event will fail. If the
probability that the event will fail on any single trial is $(1-p)$,
the probability that it will fail every time in $n$ trials is
$(1-p)^n$. The probability that it will happen on the first trial and
fail on the succeeding $n-1$ trials is $p(1-p)^{n-1}$. But the event
is just as likely to happen on the second, third, etc., trials as on
the first. Hence the probability that the event will happen just once
in the $n$ trials is

$$np(1-p)^{n-1}.$$

Continuing this process, we can easily see that the probability
that it will happen $m$ times in $m+n$ trials is

$$\frac{(m+n)!}{m!\,n!}\,p^m(1-p)^n. \tag{1}$$
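Formula (1) is easy to evaluate directly. The following Python sketch is an illustration added here and is not part of the original treatment; the function name `binomial_probability` and the value p = 0.5 in the example are assumptions chosen for the illustration.

```python
from math import comb

def binomial_probability(m, n, p):
    """Probability of exactly m successes and n failures in m + n trials,
    following formula (1): (m+n)!/(m! n!) * p^m * (1-p)^n."""
    return comb(m + n, m) * p**m * (1 - p)**n

# Example: chance of exactly 3 males in 5 drawings when males and
# females are assumed equally likely (p = 0.5 is a hypothetical value).
print(binomial_probability(3, 2, 0.5))
```

With m = 1 the function reduces to the expression np(1 - p)^(n-1) for an event that happens just once, where n there counts all the trials.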



From (1) we can obtain directly the probability that exactly $m$
males and $n$ females occur in $m+n$ drawings, and the expression
will be

$$P = \frac{(m+n)!}{m!\,n!}\,x^m(1-x)^n, \tag{2}$$

where $x$ denotes the proportion of males and $(1-x)$ that of females.
If we call the right-hand member of (2) $y$, then

$$y = c\,x^m(1-x)^n$$

and this equation represents a curve with a zero ordinate when
$x = 0$ and $x = 1$. Since the elementary area under the curve is

$$\frac{(m+n)!}{m!\,n!}\,x^m(1-x)^n\,dx,$$

the number of cases which occur between the two limits $(a, b)$ is
proportional to the sum of the differentials taken from $a$ to $b$, or

$$\frac{(m+n)!}{m!\,n!}\int_a^b x^m(1-x)^n\,dx.$$



Also the totality of cases will be proportional to the sum of the
differentials taken from 0 to 1, or

$$\frac{(m+n)!}{m!\,n!}\int_0^1 x^m(1-x)^n\,dx.$$

Hence the probability that the ratio of the males in the jar to
the entire lot lies between the two limits is

$$P = \frac{\int_a^b x^m(1-x)^n\,dx}{\int_0^1 x^m(1-x)^n\,dx}. \tag{3}$$

This is the well-known theorem of Bayes.¹ This equation as it
stands is applicable to cases where the number of observations
is small. For cases where the number of observations is large
we must modify this still further.

¹ See Todhunter's "History of the Theory of Probability from the Time of Pascal
to that of Laplace," 1865.
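For a small number of observations, formula (3) can be evaluated exactly by expanding $(1-x)^n$ with the binomial theorem and integrating term by term. The Python sketch below is an added illustration under that approach; the function names and the sample values (7 males, 3 females, limits 1/2 and 9/10) are hypothetical and do not come from the original.

```python
from fractions import Fraction
from math import comb

def integral(m, n, a, b):
    """Exact integral of x^m (1-x)^n from a to b, obtained by expanding
    (1-x)^n binomially and integrating each power of x."""
    a, b = Fraction(a), Fraction(b)
    total = Fraction(0)
    for k in range(n + 1):
        power = m + k + 1
        coeff = Fraction((-1) ** k * comb(n, k), power)
        total += coeff * (b ** power - a ** power)
    return total

def bayes_probability(m, n, a, b):
    """Formula (3): probability that the true proportion lies in (a, b)
    after observing m of one kind and n of the other."""
    return integral(m, n, a, b) / integral(m, n, 0, 1)

# Hypothetical small sample: 7 males and 3 females observed.
print(float(bayes_probability(7, 3, Fraction(1, 2), Fraction(9, 10))))
```

Rational arithmetic is used so that no rounding enters; for large m and n the alternating sum becomes unwieldy, which is one practical reason for the approximation developed next.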

If we suppose the two limits to be $m/s \pm \delta$, where $s = m + n$,
then equation (3) may be written in the following form:

$$P = \frac{\int_{m/s-\delta}^{m/s+\delta} x^m(1-x)^n\,dx}{\int_0^1 x^m(1-x)^n\,dx}.$$



By successive integration by parts the denominator is evaluated,
giving

$$\int_0^1 x^m(1-x)^n\,dx = \frac{m!\,n!}{(m+n+1)!}. \tag{4}$$

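If we let $x = m/s + z$ the numerator becomes

$$\int_{-\delta}^{+\delta}\left(\frac{m}{s}+z\right)^m\left(\frac{n}{s}-z\right)^n dz,$$

which is approximately

$$\frac{m^m n^n}{s^s}\int_{-\delta}^{+\delta} e^{-\frac{s^3z^2}{2mn}}\,dz.$$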
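The step from the exact integrand to the exponential form is not spelled out in the text. One way to reach it, offered here as a sketch rather than as the author's own derivation, is to expand the logarithm of the integrand to the second order in $z$, assuming $sz/m$ and $sz/n$ are small:

$$
\begin{aligned}
\ln\left[\left(\frac{m}{s}+z\right)^m\left(\frac{n}{s}-z\right)^n\right]
&= \ln\frac{m^m n^n}{s^s} + m\ln\left(1+\frac{sz}{m}\right) + n\ln\left(1-\frac{sz}{n}\right) \\
&\approx \ln\frac{m^m n^n}{s^s} + \left(sz - \frac{s^2z^2}{2m}\right) + \left(-sz - \frac{s^2z^2}{2n}\right) \\
&= \ln\frac{m^m n^n}{s^s} - \frac{s^2z^2}{2}\cdot\frac{m+n}{mn}
 = \ln\frac{m^m n^n}{s^s} - \frac{s^3z^2}{2mn}.
\end{aligned}
$$

Taking exponentials of both sides gives the integrand $\frac{m^m n^n}{s^s}\,e^{-s^3z^2/(2mn)}$.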
With the above transformation the formula becomes

$$P = \frac{(s+1)!}{m!\,n!}\,\frac{m^m n^n}{s^s}\int_{-\delta}^{+\delta} e^{-\frac{s^3z^2}{2mn}}\,dz. \tag{5}$$



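Now if we apply Stirling's formula for large numbers, the reciprocal
of (4), which is the factorial coefficient in (5), becomes

$$\frac{(s+1)!}{m!\,n!} \approx \frac{s^s}{m^m n^n}\sqrt{\frac{s^3}{2\pi mn}}.$$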
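How close the Stirling form is to the exact coefficient can be checked numerically. The sketch below is an added illustration and compares the logarithms of the two sides, since the quantities themselves are far too large for ordinary floating-point numbers; the function names are assumptions made for the example.

```python
from math import lgamma, log, pi

def log_exact_coefficient(m, n):
    """Natural log of (s+1)!/(m! n!) with s = m + n, using lgamma(k+1) = ln k!."""
    s = m + n
    return lgamma(s + 2) - lgamma(m + 1) - lgamma(n + 1)

def log_stirling_coefficient(m, n):
    """Natural log of the Stirling form s^s/(m^m n^n) * sqrt(s^3/(2 pi m n))."""
    s = m + n
    return s * log(s) - m * log(m) - n * log(n) + 0.5 * log(s**3 / (2 * pi * m * n))

# Counts of the size met with in the tadpole example.
m, n = 4813, 5136
print(log_exact_coefficient(m, n) - log_stirling_coefficient(m, n))
```

A difference of logarithms near zero means the ratio of the two coefficients is near one.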



Therefore (5) may be written in the following form:

$$P = \sqrt{\frac{s^3}{2\pi mn}}\int_{-\delta}^{+\delta} e^{-\frac{s^3z^2}{2mn}}\,dz.$$

If we assume

$$z\sqrt{\frac{s^3}{2mn}} = t$$

and substitute $\omega$ for

$$\delta\sqrt{\frac{s^3}{2mn}},$$

then as a final form we have

$$P = \frac{2}{\sqrt{\pi}}\int_0^{\omega} e^{-t^2}\,dt.$$



Since the above expression is a well-known equation for the 
probability integral, the degree of probability for any given limits 
may be determined from the table. For computation we have 
the following relation : 



$$\omega = \delta\sqrt{\frac{s^3}{2mn}} \quad\text{or}\quad \delta = \omega\sqrt{\frac{2mn}{s^3}}, \quad\text{and}\quad P = f(\omega).$$
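The probability integral $\frac{2}{\sqrt{\pi}}\int_0^{\omega} e^{-t^2}\,dt$ is the function now usually called the error function, which Python provides as `math.erf`, so the table can be replaced by a direct computation. The sketch below is an added illustration; the function names are assumptions, and the symbols follow the relations above.

```python
from math import erf, sqrt

def omega_from_delta(m, n, delta):
    """omega = delta * sqrt(s^3 / (2 m n)), with s = m + n."""
    s = m + n
    return delta * sqrt(s**3 / (2 * m * n))

def probability_within(m, n, delta):
    """Probability that the true proportion lies within m/s +/- delta,
    by the large-sample form P = erf(omega)."""
    return erf(omega_from_delta(m, n, delta))

# A deviation of 0.0058 with 4,813 males and 5,136 females observed.
print(probability_within(4813, 5136, 0.0058))
```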



As will be seen from the above relations, the degree of probability
increases with the amount of deviation. Therefore in
any given case we can either increase or decrease the value of the
probability by changing the value of the deviation. Thus in
order to facilitate a comparison of several sets of data, it would
be advantageous to fix the value of the probability and then
determine the corresponding amount of the deviation. If we take
0.75 for the value of the probability, it will certainly be high
enough for practical purposes. If we adopt this system, then the
corresponding value of $\omega$ is fixed and is equal to 0.814. Thus
we do not even need to use the table to determine the amount of
deviation. If however one wishes to use any other value of the
probability than the one I have proposed (i.e., 0.75), the
corresponding $\omega$ can be readily obtained from the table. Thus the
determination of the deviation can be made by a simple arithmetical
process.
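The value of $\omega$ belonging to a chosen probability can also be found without a table by solving $P = f(\omega)$ numerically. The following sketch is an added illustration that uses simple bisection on `math.erf`; the function name, the search interval and the tolerance are assumptions.

```python
from math import erf

def omega_for_probability(p, tol=1e-10):
    """Find omega with erf(omega) = p for 0 < p < 1 by bisection.
    erf is increasing, so the root is bracketed by [0, 6]."""
    lo, hi = 0.0, 6.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if erf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# The fixed probability proposed in the text.
print(omega_for_probability(0.75))
```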

The following will illustrate a method of determination. Dr. 
King has kindly supplied the data for this purpose and I wish to 
thank her for that material. 

Example: Out of 16,100 tadpoles, 9,949 are examined. 5,136
are found to be females and 4,813 males. Find the probability
that the ratio of females to the entire lot lies between given limits.
Since we adopted the fixed value for the probability (0.75), the
value of $\omega$ is also fixed and equal to 0.814. Our problem
is therefore to find the value of $\delta$ when the value of $P$ is
known. We have

$$\delta = 0.814\sqrt{\frac{2 \times 4{,}813 \times 5{,}136}{9{,}949^3}} = 0.814 \times 0.00709 = 0.0058.$$



Deviation = 16,100 × 0.0058 = ± 93.

Total number of females = 16,100 × 5,136/9,949 = 8,311 ± 93.

With this value of probability we conclude that the total
number of females in the entire lot is neither greater than 8,404
nor less than 8,218.
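The arithmetic of the example can be retraced in a few lines. The sketch below is an added illustration that follows the steps of the text with the counts given there; the function and variable names are assumptions. It carries the deviation at full precision instead of rounding it to 0.0058 first, so intermediate figures may differ slightly in the last place.

```python
from math import sqrt

def deviation_fraction(m, n, omega=0.814):
    """delta = omega * sqrt(2 m n / s^3), with s = m + n;
    omega = 0.814 corresponds to the fixed probability 0.75."""
    s = m + n
    return omega * sqrt(2 * m * n / s**3)

males, females, lot = 4813, 5136, 16100

delta = deviation_fraction(males, females)
estimate = lot * females / (males + females)   # expected females in the lot
deviation = lot * delta                        # allowance on that number

print(round(estimate), "+/-", round(deviation))
```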



